System and method for constructing a social network from multiple disparate, heterogeneous data sources

ABSTRACT

A computer implemented method of constructing a social network, the method including constructing the social network from a plurality of disparate, heterogeneous data sources, wherein at least one of the plurality of disparate, heterogeneous data sources includes a user generated data source; identifying a plurality of nodes and linkages; determining attributes of the nodes and linkages based on a plurality of disparate, heterogeneous data sources, wherein the plurality of disparate, heterogeneous data sources includes a combination of the user generated data source and at least one non-user generated source, wherein the attributes include at least one of a deterministic attribute, a probabilistic attribute, and a dynamic attribute; populating a mathematical decision-making model based on the plurality of nodes and linkages, and the determined attributes of the plurality of nodes and linkages; determining attributes of the nodes and linkages at a second point in time; re-populating the mathematical decision-making model based on the plurality of nodes and linkages, and the determined attributes of the plurality of nodes and linkages at the second point in time.

CROSS-REFERENCE TO RELATED APPLICATION

The present application is related to U.S. patent application Ser. No. 11/414,233, filed on May 1, 2006, to Chess, et al., entitled “SYSTEM AND METHOD FOR MEASURING BUSINESS TRANSFORMATION IMPACT USING SOCIAL NETWORK ANALYTICS”, which is incorporated herein by reference, in its entirety.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention generally relates to a system and method for constructing a social network, and more particularly, the present invention relates to a system, method, and framework for constructing a social network from multiple, disparate, heterogeneous data sources, in which at least one of the data sources includes a user generated data source.

2. Description of the Related Art

For purposes of the present application, the term “social network” generally means a social structure made of nodes which are generally individuals or organizations, and edges or links between them.

For purposes of the present application, the term “social computing”, or social network technology, generally means the use of social software. Social computing represents a growing trend of tools supporting social interaction and communication. For example, social computing can include email, instant messaging, blogs, wikis, auctions, web interactive communication or research, online social networking websites, etc.

A social network is a map of the relationships between individuals, indicating the ways in which they are connected through various social familiarities ranging from casual acquaintance to close familial bonds. The term was first coined in 1954 by J. A. Barnes (in: Class and Committees in a Norwegian Island Parish, “Human Relations”). Social network analysis (SNA) (also sometimes called network theory) has emerged as a key technique in modern sociology, anthropology, social psychology and organizational studies.

Research in a number of academic fields has demonstrated that social networks operate on many levels, from families up to the level of nations, and play a critical role in determining the way problems are solved, organizations are run, information is shared, and the degree to which individuals succeed in achieving their goals.

Social networking also refers to a category of Internet applications to help connect friends, business partners, or other individuals together using a variety of tools. These applications, known as online social networks, are becoming increasingly popular.

Generally, social network theory views social relationships in terms of nodes and ties (or linkages). Nodes are the individual actors within the networks, and linkages are the relationships between the actors.

There can be many kinds of linkages between the nodes. In its simplest form, a social network is a map of all of the relevant linkages between the nodes being studied. The network can also be used to determine the social capital of individual actors. These concepts are often displayed in a social network diagram, where nodes are the points and linkages are the lines.

The shape of the social network helps determine a network's usefulness to its individuals. Smaller, tighter networks can be less useful to their members than networks with lots of loose connections (weak ties) to individuals outside the main network. More “open” networks, with many weak ties and social connections, are more likely to introduce new ideas and opportunities to their members than closed networks with many redundant ties. In other words, a group of friends who only do things with each other already share the same knowledge and opportunities. A group of individuals with connections to other social worlds is likely to have access to a wider range of information. It is better for individual success to have connections to a variety of networks rather than many connections within a single network. Similarly, individuals can exercise influence or act as brokers within their social networks by bridging two networks that are not directly linked (called filling social holes).

The power of social network theory stems from its difference from traditional sociological studies, which assume that it is the attributes of individual actors that matter. Social network theory produces an alternate view, where the attributes of individuals are less important than their relationships and ties with other actors within the network. This approach has turned out to be useful for explaining many real-world phenomena, but leaves less room for individual agency, and the ability for individuals to influence their success, since so much of it rests within the structure of their network.

Social networks have also been used to examine how companies interact with each other, characterizing the many informal connections that link executives together, as well as associations and connections between individual employees at different companies. These networks provide ways for companies to gather information, deter competition, and even collude in setting prices or policies.

Power within organizations, for example, generally has been found to come more from the degree to which an individual within a network is at the center of many relationships than actual job title. Social networks also play a key role in hiring, in business success for firms, and in job performance.

Social networking websites (e.g., online social networks) have become widely used in virtual communities. In these communities, an initial set of founders sends out messages inviting members of their own personal networks to join the site. New members repeat the process, growing the total number of members and links in the network. Sites then offer features such as automatic address book updates, viewable profiles, the ability to form new links through “introduction services,” and other forms of online social connections. Social networks can also be organized around business connections.

Blended networking is an approach to social networking that combines both offline elements (face-to-face events) and online elements. The newest social networks on the Internet are becoming more focused on niches.

The following are some terms which generally are used in describing social networks.

The term “betweenness” generally means the degree an individual lies between other individuals in the network; the extent to which a node is directly connected only to those other nodes that are not directly connected to each other; an intermediary; liaisons; bridges. Therefore, “betweenness” generally means the number of people who a person is connected to indirectly through their direct links.

The term “closeness” generally means the degree an individual is near all other individuals in a network (directly or indirectly) and reflects the ability to access information through the “grapevine” of network members. Thus, closeness is the inverse of the sum of the shortest distances between each individual and every other person in the network.
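The closeness definition above can be sketched in a few lines. This is a minimal illustration only, assuming an unweighted, undirected network stored as a dictionary of neighbor sets; the names `shortest_distances`, `closeness`, and the four-person `graph` are hypothetical and do not come from the present application. Nodes that cannot be reached are simply left out of the sum, which is one of several possible conventions.

```python
from collections import deque

def shortest_distances(graph, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def closeness(graph, node):
    """Inverse of the sum of shortest distances from node to all other nodes."""
    dist = shortest_distances(graph, node)
    total = sum(d for other, d in dist.items() if other != node)
    return 1.0 / total if total > 0 else 0.0

# Hypothetical four-person chain: ann - bob - cat - dan
graph = {
    "ann": {"bob"},
    "bob": {"ann", "cat"},
    "cat": {"bob", "dan"},
    "dan": {"cat"},
}

closeness_bob = closeness(graph, "bob")  # distances 1 + 1 + 2
closeness_ann = closeness(graph, "ann")  # distances 1 + 2 + 3
```

In this chain the middle actors score higher than the end actors, which matches the intuition that they sit nearer to everyone else.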

The term “degree” generally means the count of the number of linkages or ties to other actors in the network.

The term “Eigenvector Centrality” generally is a measure of the importance of a node in a network. It generally assigns relative scores to all nodes in the network based on the principle that connections to nodes having a high score contribute more to the score of the node in question.
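One common way to compute such scores is power iteration, sketched below under stated assumptions: the network is connected and undirected, it is stored as a dictionary of neighbor sets, and each node's own score is added at every step (a standard shift that helps the iteration settle). The function name, parameters, and example network are hypothetical and are not taken from the present application.

```python
def eigenvector_centrality(graph, iterations=200, tol=1e-10):
    """Power iteration; scores are rescaled so that the largest score is 1."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        # Each node keeps its own score and adds the scores of its neighbors.
        new = {n: scores[n] + sum(scores[m] for m in graph[n]) for n in graph}
        norm = max(new.values())
        new = {n: value / norm for n, value in new.items()}
        if max(abs(new[n] - scores[n]) for n in graph) < tol:
            return new
        scores = new
    return scores

# Hypothetical network: a triangle (a, b, c) with d attached only to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
scores = eigenvector_centrality(graph)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Here `c` ranks first because it is tied to the well-connected `a` and `b`, while `d` ranks last even though its only tie is to the highest-scoring node.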

The term “clustering coefficient” generally means a measure of the likelihood that two associates of a node are associates themselves. A higher clustering coefficient indicates a greater ‘cliquishness’.
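As an illustration of this definition, the following sketch computes a per-node clustering coefficient as the fraction of pairs of a node's associates that are themselves linked. It assumes an undirected network held as a dictionary of neighbor sets; the function name and the example network are hypothetical.

```python
from itertools import combinations

def clustering_coefficient(graph, node):
    """Fraction of pairs of the node's neighbors that are linked to each other."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two associates: no pairs to compare
    linked_pairs = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return linked_pairs / (k * (k - 1) / 2)

# Hypothetical network: a triangle (a, b, c) with d attached only to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cc_a = clustering_coefficient(graph, "a")  # both associates know each other
cc_c = clustering_coefficient(graph, "c")  # one linked pair out of three
```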

The term “cohesion” generally means the degree to which actors are connected directly to each other by cohesive bonds. Groups generally are identified as ‘cliques’ if every actor is directly tied to every other actor, or ‘social circles’ if there is less stringency of direct contact.

The term “individual-level density” generally means the degree to which a respondent's linkages know one another, or the proportion of linkages among an individual's nominees. The term “network or global-level density” is the proportion of linkages in a network relative to the total number possible (sparse versus dense networks).
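The network-level density can be sketched as follows, assuming an undirected network without self-links stored as a dictionary of neighbor sets, so that the total number of possible linkages among n nodes is n(n-1)/2. The function name and example are hypothetical.

```python
def network_density(graph):
    """Linkages present divided by the n * (n - 1) / 2 linkages possible."""
    n = len(graph)
    if n < 2:
        return 0.0
    # Every undirected linkage appears in two neighbor sets, so halve the total.
    present = sum(len(neighbors) for neighbors in graph.values()) / 2
    return present / (n * (n - 1) / 2)

# Hypothetical network: a triangle (a, b, c) with d attached only to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
density = network_density(graph)  # 4 linkages out of 6 possible
```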

The term “group degree centralization” generally means a measure of group dispersion or how network links focus on a specific node or nodes.

The term “radiality” generally means the degree an individual's network reaches out into the network and provides novel information and influence.

The term “reach” generally means the degree any member of a network can reach other members of the network.

The term “structural equivalence” generally means the extent to which actors have a common set of linkages to other actors in the system. The actors don't need to have any linkages to each other to be structurally equivalent.

The term “static holes” generally means structural holes that can be strategically filled by connecting one or more links to link together other points. Linked to ideas of social capital: if you link to two people who are not linked you can control their communication.

Conventional methods generally rely on only one, or only homogeneous sources of data to construct the social network. The problem is that the analysis performed based on the derived social network is limited.

Furthermore, the conventional methods do not consider more than one source of user generated information, nor do they consider user generated sources in combination with non-user generated sources.

Conventional methods generally describe construction of social networks with multiple kinds of edges, reflecting different kinds of relationships. However, the conventional methods use only one data source to construct the edges.

Other conventional methods use SNA to build knowledge maps, which are constructs within the area of knowledge management. Such conventional methods borrow from the standard practice of SNA (and other disciplines), but do not suggest or extend construction methods, according to the exemplary aspects of the present invention.

SUMMARY OF THE INVENTION

In view of the foregoing and other exemplary problems, drawbacks, and disadvantages of the related art methods and structures, an exemplary feature of the present invention is to provide a system, method, and framework for constructing a social network from multiple, disparate, and heterogeneous data sources, wherein at least one data source includes a user generated data source.

The present inventors have recognized that conventional social network analysis can be improved significantly by providing a richness of data derived from multiple, disparate, heterogeneous data sources, wherein at least one data source includes a user generated data source.

The present invention recognizes that conventional social network construction may result in limited analysis because of limited ability to triangulate and verify information, and to eliminate inconsistencies.

For purposes of the present application, the term “user” means the same entities that become actors/nodes in the social network. The term “heterogeneous” generally means that at least one attribute is not in common. The term “disparate” generally means that all attributes are not in common.

The problem with the conventional approaches is that user generated data is subjective by nature. Construction based on the multiple, heterogeneous sources allows for triangulation and provides a means for a consistency check. That is, constructed social networks are therefore more reliable representations. The increased accuracy allows for improved analysis and greater potential for diagnosis and prescriptive use.

Also, the present invention can perform social network optimization based on the rich data obtained from such multiple, disparate, heterogeneous data sources.

For example, in one exemplary aspect of the invention, a computer implemented method of constructing a social network includes constructing the social network from a plurality of disparate, heterogeneous data sources.

In another exemplary aspect of the invention, a system for constructing a social network includes a constructing unit that constructs said social network from a plurality of disparate, heterogeneous data sources.

In another exemplary aspect of the invention, a system for constructing a social network includes means for identifying a plurality of nodes and linkages of the social network, and means for determining attributes of the nodes and linkages based on a plurality of disparate, heterogeneous data sources.

In another exemplary aspect of the invention, a method of deploying computing infrastructure in which recordable, computer-readable code is integrated into a computing system, and combines with the computing system to perform a method of constructing a social network from a plurality of disparate, heterogeneous data sources.

In another exemplary aspect of the invention, a signal-bearing medium tangibly embodying a program of recordable, machine-readable instructions executable by a digital processing apparatus to perform a method of constructing a social network from a plurality of disparate, heterogeneous data sources.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other exemplary purposes, aspects and advantages will be better understood from the following detailed description of an exemplary embodiment of the invention with reference to the drawings, in which:

FIG. 1 illustrates a social network 100 according to an exemplary, non-limiting embodiment of the present invention;

FIG. 2 illustrates a system 200 according to another exemplary, non-limiting embodiment of the present invention;

FIG. 3 illustrates a method 300 according to an exemplary aspect of the invention;

FIG. 4 illustrates a method 400 according to an exemplary aspect of the invention;

FIG. 5 graphically illustrates an exemplary system 500 according to an exemplary aspect of the invention;

FIG. 6 graphically illustrates an exemplary system 600 according to an exemplary aspect of the invention;

FIG. 7 graphically illustrates an exemplary system 700 according to an exemplary aspect of the invention;

FIG. 8 illustrates an exemplary hardware/information handling system 800 for incorporating the present invention therein; and

FIG. 9 illustrates a signal bearing medium (e.g., storage medium 900) for storing/recording steps of a program of a method according to the present invention.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS OF THE INVENTION

Referring now to the drawings, and more particularly to FIGS. 1-9, there are shown exemplary embodiments of the method and structures according to the present invention.

The present invention generally relates to a system and method for performing social network analysis (SNA), which has emerged as a key technique in modern sociology, anthropology, social psychology and organizational studies. SNA is also emerging as a consulting methodology for understanding business processes, communication patterns within and between businesses, communities of practice, and customer markets.

As mentioned above, a social network generally means a social structure made of nodes and edges or links between them. The nodes of the social network generally identify individuals or organizations. The links of the social network generally demonstrate relationships between pairs of nodes (e.g., between the individuals and/or organizations). An edge generally means an undirected link between two nodes, and an arc represents a directed link between two nodes. For example, node A goes to node B for information.

As mentioned above, the term “social computing”, or social networking technology, generally means the use of social software. Social computing represents a growing trend of tools supporting social interaction and communication. For example, social computing can include email, instant messaging, blogs, wikis, auctions, web interactive communication or research, online social networking websites, etc.

The present invention relates to a method and system for constructing a social network from multiple, disparate, heterogeneous data sources.

The present invention also exemplarily provides a method and system for performing optimization based on social network analysis to make business decisions and allocate resources based on the social network which is constructed from multiple, disparate, heterogeneous data sources.

FIG. 1 exemplarily illustrates a social network, according to the present invention, in which optimal allocation of limited resources can be performed to improve the connectivity between two groups.

The present invention has recognized that nodes and links/edges can have various attributes. These attributes can be used to populate data sources for constructing the social network. For example, the present invention provides a method and system for constructing a social network from multiple, disparate, and heterogeneous data sources.

The present invention can provide automated (e.g., scraping, parsing) collection of data combined with traditional survey methods for social network construction.

Thus, the present invention has an important feature in that a richness of attributes can be provided. The conventional systems and methods cannot, and do not, provide such attribute richness, or for that matter, provide decision making based on such rich attributes.

According to the present invention, the rich attributes of the nodes and/or links/edges can be identified and used to populate multiple, disparate, and heterogeneous data sources for constructing the social network. Such attributes of the nodes and/or links/edges can include, for example, deterministic attributes, probabilistic attributes, dynamic characterization, etc.

For example, the present invention can capture dynamic social network aspects for the network components (e.g., the nodes and/or links/edges).

According to the present invention, the attributes can be related to the people or organizations themselves (i.e., nodes) or related to the linkages among the nodes.

Examples of attributes (or metrics) associated with nodes can include, among others, title, department, number of years with company, resume, telephone number, e-mail address, physical office location, education, experience, past projects, gender, languages spoken, knowledge of computer programming languages, etc.

Examples of attributes (or metrics) associated with linkages can include, among others, how people collaborate, patterns of communication, frequency of communication, information sharing, decision-making and innovation within a particular organization or group, or between particular nodes, how the nodes know each other (e.g., through work, soccer, co-authoring a patent, co-authoring a paper, etc.), brokering between nodes, cliques formed among the nodes, path lengths of communication between nodes, density, etc.

An “edge” generally means a pairing of two nodes. An edge can be a uni- or bi-directional link between two nodes. Each edge also can have attributes, such as how person A knows person B, or that persons A and B know each other because they play soccer together, work in the same department, co-authored a paper together, are co-inventors on a patent, etc. Other examples of edge attributes include strength of relationship, frequency of communication, probability of communicating in the future, level of trust of person A by person B, etc.
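One possible in-memory representation of such an edge is sketched below. It is illustrative only: the `Edge` class, its field names, and the attribute keys (`how_known`, `contacts_per_week`, `p_future_contact`, `trust_level`) are hypothetical choices made for this example and are not defined by the present application.

```python
from dataclasses import dataclass, field

@dataclass
class Edge:
    """A pairing of two nodes, optionally directed, carrying free-form attributes."""
    source: str
    target: str
    directed: bool = False
    attributes: dict = field(default_factory=dict)

    def connects(self, a, b):
        """True if the edge leads from a to b (either direction when undirected)."""
        if (self.source, self.target) == (a, b):
            return True
        return not self.directed and (self.source, self.target) == (b, a)

# Bi-directional linkage: persons A and B play soccer together.
soccer = Edge("A", "B", attributes={
    "how_known": "soccer",        # deterministic attribute
    "contacts_per_week": 3,       # frequency of communication
    "p_future_contact": 0.8,      # probabilistic attribute
})

# Uni-directional linkage: level of trust of person A by person B.
trust = Edge("B", "A", directed=True, attributes={"trust_level": 0.6})
```

Keeping the attributes in a plain dictionary lets each data source contribute whatever fields it has without changing the edge type.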

An important aspect of the present invention is providing a richness of data for populating the social network. To provide such rich data, the exemplary method and system of the present invention can construct a social network from multiple, disparate, and heterogeneous data sources.

With reference to FIG. 2, an exemplary system according to the present invention can include a social network analysis unit 240, which receives input from a plurality of disparate, heterogeneous data sources (e.g., 225, 230, 245).

The present invention can provide automated collection (e.g., scraping, parsing, etc.) combined with traditional user-generated (e.g., survey) methods for social network construction.

For example, with reference again to FIG. 2, data 230 can be derived (or automatically collected) from social computing units 205, 210, and 215. The social computing units 205, 210, and 215 can include, for example, email, instant messaging, blogs, wikis, auctions, web interactive communication or research, online social networking websites, etc.

On the other hand, data 225 can be derived from user generated data 220 (e.g., traditional surveys, a plurality of user generated data sources, etc.). In one aspect of the invention, the data sources include at least one user generated data source (e.g., a survey, etc.) and at least one non-user generated data source.

An exemplary method according to the present invention is described with reference to FIG. 3.

For example, according to the exemplary aspects of the present invention, a survey can be administered to a group of participants of an event (e.g., for mixing people of different backgrounds and organizations) prior to the event to obtain a plurality of user generated data. Another survey can be administered after the event, and/or after a predetermined period of time has elapsed from the time of the event.

Since some of the participants will have interacted at the event, and possibly gotten to know each other during the event, connections may have been made. As another example, some participants may have obtained research ideas from participants who deal with clients, while others may derive client proposal ideas from research participants.

The present invention can perform social network analysis of the attendees based on survey information before the event, survey information after the event, and time delayed follow up survey information, which may include whether the participants are now talking or working together, or have done so since the event. Also, secondary interactions/connections can be taken into account, such as participants connecting with others through other participants, or by word of mouth/e-mail, etc.

The ordinarily skilled artisan would know and understand that other data 245 also can be derived or extracted from a variety of other sources, such as directories, etc.

The present invention can construct a social network from a plurality of disparate, heterogeneous data sources, such as survey data (e.g., a plurality of user generated data sources), social computing data, and combinations thereof. Hence, the present invention can provide attribute richness, including deterministic and probabilistic attributes, as well as capturing dynamic social network aspects (i.e., dynamic characterization of network components (e.g., nodes and linkages)) by extracting or obtaining data from disparate, heterogeneous data sources.
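A minimal sketch of combining a user generated source with a non-user generated source is given below. It assumes each source has already been reduced to records of the form (node, node, attributes); the function names `merge_sources` and `corroborated`, the source labels, and the sample records are all hypothetical and are offered only to illustrate triangulation across sources, not as the method of the present application.

```python
def merge_sources(sources):
    """Combine per-source linkage records into one attribute-rich network.

    sources maps a source name to a list of (node_a, node_b, attributes).
    """
    network = {}
    for source_name, records in sources.items():
        for a, b, attrs in records:
            key = frozenset((a, b))  # treat the pair as undirected
            edge = network.setdefault(key, {"sources": set(), "attributes": {}})
            edge["sources"].add(source_name)
            edge["attributes"].update(attrs)
    return network

def corroborated(network):
    """Linkages reported by more than one source (a simple consistency check)."""
    return {key for key, edge in network.items() if len(edge["sources"]) > 1}

# Hypothetical inputs: one survey (user generated) and one e-mail log (automated).
sources = {
    "survey": [
        ("ann", "bob", {"how_known": "work"}),
        ("ann", "cat", {"how_known": "soccer"}),
    ],
    "email": [
        ("bob", "ann", {"emails_per_week": 12}),
    ],
}
network = merge_sources(sources)
confirmed = corroborated(network)
```

In this example the ann-bob linkage carries attributes from both sources and is flagged as corroborated, while the ann-cat linkage rests on the survey alone.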

The aforementioned exemplary linkages between people can provide valuable metrics and can provide disparate, heterogeneous data to be used to compare the before and after states of the nodes and linkages of the social network and make business decisions.

With reference to the exemplary method illustrated in FIG. 3, a computer implemented method of constructing a social network includes, for example, constructing the social network from a plurality of disparate, heterogeneous data sources (e.g., see 310).

With reference to the exemplary method illustrated in FIG. 4, a computer implemented method of constructing the social network includes, for example, identifying a plurality of nodes and linkages (e.g., see 410) and determining attributes of the nodes and linkages based on the plurality of disparate, heterogeneous data sources (e.g., see 420). The attributes can include, for example, at least one of a deterministic attribute, a probabilistic attribute, and a dynamic attribute.

With reference again to FIG. 3, the present invention also can provide social network optimization to the analytics provided by the social network analysis constructed from the plurality of disparate, heterogeneous data sources (e.g., see 320). In one aspect of the invention, the data sources include at least one user generated data source (e.g., a survey, etc.) and at least one non-user generated data source. Thus, social network optimization can be performed to make business decisions to use the information, for example, to identify places in the social network that merit focus, to campaign in a certain way, etc.

With reference again to FIG. 4, the present invention exemplarily can populate a mathematical decision-making model based on the attributes (e.g., see 430) (e.g., to perform social network analysis).

The present invention can determine attributes of the nodes and linkages from a plurality of disparate, heterogeneous data sources at another point in time (e.g., a second point in time after the first determination of attributes is made) (e.g., see 440). The mathematical decision-making model can then be re-populated based on the second set of attributes (e.g., see 450) (e.g., SNA can be re-performed). This process of determining attributes at different points in time and re-populating the decision-making model can be repeated, as exemplarily illustrated in FIG. 4 (e.g., SNA can be repeated).
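The populate, re-determine, and re-populate cycle can be illustrated with a deliberately simple stand-in. The present application does not specify the form of the mathematical decision-making model, so the sketch below assumes, purely for illustration, that the model is a table of degree per node; `populate_model`, `compare`, and the pre-event and post-event linkage lists are hypothetical.

```python
def populate_model(linkages):
    """Hypothetical stand-in for the decision-making model: degree per node."""
    model = {}
    for a, b in linkages:
        model[a] = model.get(a, 0) + 1
        model[b] = model.get(b, 0) + 1
    return model

def compare(before, after):
    """Change in each node's degree between two populations of the model."""
    nodes = set(before) | set(after)
    return {n: after.get(n, 0) - before.get(n, 0) for n in nodes}

# Linkages determined at a first point in time (e.g., a pre-event survey).
pre_event = [("ann", "bob")]
# Linkages determined at a second point in time (e.g., a follow-up survey).
post_event = [("ann", "bob"), ("ann", "cat"), ("bob", "cat")]

model_t1 = populate_model(pre_event)    # populate
model_t2 = populate_model(post_event)   # re-populate at the second point in time
change = compare(model_t1, model_t2)
```

Repeating the last two steps with each new collection of data gives a sequence of before and after comparisons on which a decision could be based.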

FIG. 5 exemplarily illustrates a system for solving crime using social network optimization, according to an exemplary aspect of the invention.

FIG. 6 exemplarily illustrates a system for identification of communication patterns within a terrorist network (Mar. 12, 2006, NY Times Magazine).

FIG. 7 exemplarily illustrates analyzing and diagnosing collaboration barriers and risks in organizational networks, using social network optimization according to an exemplary aspect of the present invention.

Another exemplary aspect of the invention relates to a system for constructing a social network, including means for identifying a plurality of nodes and linkages of the social network, and means for determining attributes of the nodes and linkages based on a plurality of disparate, heterogeneous data sources.

While the invention is exemplarily described with respect to these exemplary services, those skilled in the art will recognize that the invention is not limited to the exemplary embodiments.

FIG. 8 illustrates an exemplary hardware/information handling system 800 for incorporating the present invention therein, and FIG. 9 illustrates a signal bearing medium 900 (e.g., storage medium) for storing steps of a program of a method according to the present invention.

FIG. 8 illustrates a typical hardware configuration of an information handling/computer system for use with the invention and which preferably has at least one processor or central processing unit (CPU) 811.

The CPUs 811 are interconnected via a system bus 812 to a random access memory (RAM) 814, read-only memory (ROM) 816, input/output (I/O) adapter 818 (for connecting peripheral devices such as disk units 821 and tape drives 840 to the bus 812), user interface adapter 822 (for connecting a keyboard 824, mouse 826, speaker 828, microphone 832, and/or other user interface device to the bus 812), a communication adapter 834 for connecting an information handling system to a data processing network, the Internet, an Intranet, a personal area network (PAN), etc., and a display adapter 836 for connecting the bus 812 to a display device 838 and/or printer 839.

In addition to the hardware/software environment described above, a different aspect of the invention includes a computer-implemented method for performing the above method. As an example, this method may be implemented in the particular environment discussed above.

Such a method may be implemented, for example, by operating a computer, as embodied by a digital data processing apparatus, to execute a sequence of machine-readable instructions. These instructions may reside in various types of signal-bearing media.

This signal-bearing media may include, for example, a RAM contained within the CPU 811, as represented by the fast-access storage for example. Alternatively, the instructions may be contained in another signal-bearing media, such as a data storage disk/diskette 900 (FIG. 9), directly or indirectly accessible by the CPU 811.

Whether contained in the disk/diskette 900, the computer/CPU 811, or elsewhere, the instructions may be stored on a variety of machine-readable data storage media, such as DASD storage (e.g., a conventional “hard drive” or a RAID array), magnetic tape, electronic read-only memory (e.g., ROM, EPROM, or EEPROM), an optical storage device (e.g., CD-ROM, WORM, DVD, digital optical tape, etc.), paper “punch” cards, or other suitable signal-bearing media including transmission media such as digital and analog communication links and wireless. In an illustrative embodiment of the invention, the machine-readable instructions may comprise software object code, compiled from a language such as “C”, etc.

While the invention has been described in terms of several exemplary embodiments, those skilled in the art will recognize that the invention can be practiced with modification within the spirit and scope of the appended claims. For example, the ordinarily skilled artisan would know and understand that the present invention can include other data sources, such as all media sources (e.g., video imagery, audio, etc.) which can be converted to digital format and data mined/interpreted.

Further, it is noted that Applicants' intent is to encompass equivalents of all claim elements, even if amended later during prosecution.

1. A computer implemented method of constructing a social network, the method comprising: constructing said social network from a plurality of disparate, heterogeneous data sources, wherein at least one of said plurality of disparate, heterogeneous data sources includes a user generated data source; identifying a plurality of nodes and linkages; determining attributes of said nodes and linkages based on said plurality of disparate, heterogeneous data sources, wherein said plurality of disparate, heterogeneous data sources includes a combination of said user generated data source and at least one non-user generated source, wherein said attributes comprise at least one of: a deterministic attribute, a probabilistic attribute, and a dynamic attribute; populating a mathematical decision-making model based on the plurality of nodes and linkages, and the determined attributes of said plurality of nodes and linkages; determining attributes of said nodes and linkages at a second point in time; re-populating said mathematical decision-making model based on the plurality of nodes and linkages, and the determined attributes of said plurality of nodes and linkages at said second point in time; wherein said user generated data source includes at least one survey data source; analyzing and diagnosing collaboration barriers and risks between said plurality of nodes of said social network; wherein said determining attributes of said nodes and linkages at a second point in time comprises: collecting data from said plurality of disparate, heterogeneous data sources, wherein said plurality of disparate, heterogeneous data sources further comprises: at least one of a survey data, a social computing data, and a combination thereof.